feat: Add nested hypergraph generators (Kim et al. 2023, Barrett et al. 2025)#683

Open
jg-you wants to merge 6 commits into xgi-org:dev from jg-you:nestedness

@jg-you jg-you commented Feb 19, 2026

Adds two generative models that explicitly parameterize nestedness: random_nested_hypergraph and simplicial_chung_lu_hypergraph.

Barrett et al.: Implementation of Algorithms 3 and 4 from the paper. Couldn't find much to re-use between this and chung_lu_hypergraph since the methods are fundamentally different.

Kim et al.: Implements the algorithm described in Section II here. Duplicate edges are prevented during facet generation (step 1) and after rewiring (step 3) by storing edges as frozensets in a set. I wasn't sure whether epsilon is fixed or can vary by order, so the argument accepts either a float or a list to support both cases.
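For readers following along, here is a minimal sketch of the two ideas in the paragraph above: accepting epsilon as either a float or a list, and deduplicating edges with frozensets. The helper names `broadcast_epsilon` and `unique_edges` and the `n_orders` parameter are hypothetical and are not taken from the PR; how orders map to list positions is an assumption.

```python
from numbers import Real


def broadcast_epsilon(epsilon, n_orders):
    """Return one retention probability per order (hypothetical helper).

    A scalar is repeated n_orders times; a list must already have
    exactly n_orders entries.
    """
    if isinstance(epsilon, Real):
        values = [float(epsilon)] * n_orders
    else:
        values = [float(e) for e in epsilon]
        if len(values) != n_orders:
            raise ValueError(
                f"expected {n_orders} epsilon values, got {len(values)}"
            )
    if any(not 0.0 <= e <= 1.0 for e in values):
        raise ValueError("every epsilon must lie in [0, 1]")
    return values


def unique_edges(edges):
    """Drop duplicate edges, keeping first occurrences in order.

    Each edge is converted to a frozenset so that node order inside an
    edge does not matter and the edge can be stored in a set.
    """
    seen = set()
    result = []
    for edge in edges:
        key = frozenset(edge)
        if key not in seen:
            seen.add(key)
            result.append(key)
    return result
```

With this shape, `broadcast_epsilon(0.5, 3)` and `broadcast_epsilon([0.5, 0.5, 0.5], 3)` give the same result, so the rest of the generator only ever has to handle the list form.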

Notes

  • Tried to match the codebase conventions for parameter names: d for edge size, p for probability, epsilon for retention, k1/k2 for degree/size sequences.
  • Seed reproducibility tests only assert same-seed determinism. There's a bug in the existing code, e.g., here, where different seeds could lead to the edges being identical nonetheless. Did not reproduce the pattern in the new tests as a result.
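To illustrate the testing choice in the last note, here is a small sketch using a stand-in generator (`toy_generator` is hypothetical and is not the PR's code). It shows why asserting that two different seeds give different edges can fail for legitimate reasons, while a same-seed assertion is always safe.

```python
import random


def toy_generator(n, m, d, seed=None):
    """Stand-in generator: m unique edges of size d on n nodes."""
    rng = random.Random(seed)
    edges = set()
    while len(edges) < m:
        edges.add(frozenset(rng.sample(range(n), d)))
    return edges


def test_same_seed_is_deterministic():
    # Same seed, same parameters: the outputs must match exactly.
    first = toy_generator(10, 5, 3, seed=42)
    second = toy_generator(10, 5, 3, seed=42)
    assert first == second


def test_different_seeds_can_coincide():
    # With n=4 and d=2 there are only 6 possible edges, so asking for
    # m=6 forces every seed to return the same edge set.
    assert toy_generator(4, 6, 2, seed=0) == toy_generator(4, 6, 2, seed=1)
```

The second test is only a demonstration of the failure mode. It is not a claim about which parameters trigger it in the existing test suite.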

@kaiser-dan kaiser-dan added the new feature New feature or request label Feb 26, 2026
@kaiser-dan kaiser-dan changed the base branch from main to dev February 26, 2026 15:42
@maximelucas
Collaborator

Thanks @jg-you !

Quick dispatching for a few points:

  • @leotrs, a test in stats is failing, can you have a look?
  • the test bug Jean-Gab mentioned is related to how we check equality between two hypergraphs, if I understand correctly. We have since improved this in #671 (Improved the hypergraph equality method). @nwlandry you did this, can you check if we can update the test with our new equalities?

I can review random_nested_hypergraph, maybe @nwlandry you wanna review simplicial_chung_lu_hypergraph since you know Chung-Lu better? Or anyone else.

@leotrs
Collaborator

leotrs commented Mar 5, 2026

The failing test (test_perfectly_separable_low_dimensions) was already fixed on dev — the assertions were loosened to check core cluster membership rather than exact cluster sizes (which vary across platforms due to ARPACK/LAPACK differences in eigsh).

Merging dev into the nestedness branch should resolve the CI failures.

@jg-you
Author

jg-you commented Mar 10, 2026

Fixed, should be g2g


# Step 1: Generate m unique facets of size d
facets = set()
while len(facets) < m:
Collaborator

This may go into an infinite loop if m is large and d is large compared to N, right?
If so, add some checks earlier on and raise an error to prevent this.

Author

That's fair! Actual condition is m > (n choose d).

How much optimization is worth it to the library?
I could implement sampling non-edges if m > (n choose d) // 2, which would give good performance to this algorithm in the sparse and dense regimes. Maybe there's another quick win for m \approx (n choose d) // 2.
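A rough sketch of what the check and the non-edge sampling could look like, assuming nodes are labelled 0..n-1. The function name `sample_unique_facets` is hypothetical and this is not the PR's implementation. Note that the dense branch enumerates all (n choose d) candidates, which is only practical when that number is moderate.

```python
import random
from itertools import combinations
from math import comb


def sample_unique_facets(n, m, d, seed=None):
    """Sample m distinct size-d subsets of range(n) (hypothetical helper)."""
    total = comb(n, d)
    if m > total:
        # Without this check the rejection loop below would never end.
        raise ValueError(
            f"cannot draw {m} unique facets: only {total} subsets of size {d} exist"
        )
    rng = random.Random(seed)

    def rejection_sample(k):
        # Draw random subsets until k distinct ones have been collected.
        chosen = set()
        while len(chosen) < k:
            chosen.add(frozenset(rng.sample(range(n), d)))
        return chosen

    if m <= total // 2:
        # Sparse regime: collisions are rare, sample the facets directly.
        return rejection_sample(m)

    # Dense regime: sample the (fewer) non-edges and return the complement.
    excluded = rejection_sample(total - m)
    everything = {frozenset(c) for c in combinations(range(n), d)}
    return everything - excluded
```

In this sketch the rejection loop never has to fill more than about half of the candidate space, so the worst case sits near m = (n choose d) // 2, where each new draw still succeeds with probability of roughly one half or better.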

@maximelucas
Collaborator

Ok I reviewed random_nested_hypergraph. Left a few comments, mainly to adhere to the new guidelines for random number generators, which we recently adopted in #689.

@jg-you
Author

jg-you commented Mar 16, 2026

> Ok I reviewed random_nested_hypergraph. Left a few comments, mainly to adhere to the new guidelines for random number generators, which we recently adopted in #689.

Ty, all implemented except for the one comment I have a question about -- level of optimization desired.
Main tradeoff being code complexity vs speed.

Labels

new feature New feature or request

4 participants